Data Science - Song Popularity Prediction & Song Recommendation (Group I)

Statistical Analysis

Importing Packages

Reading Dataset

Correlation Coefficients and Plots

Machine Learning

Classification

Importing Packages

Start with fresh dataframe(s)

Establish training and testing sets

Assessing Classification Models

Parameter Tuning for Random Forest Classifier

Assessing Ensemble Methods

Reviewing ROC-AUC Curve and Precision-Recall Curve

Regression

Importing Packages

Start with fresh dataframe(s)

Establish training and testing sets

Assessing Regression Models

Parameter Tuning for RFR and GBR

Parameter tuning was completed over multiple iterations because of the long runtime. The results of earlier iterations are recorded as comments in each section, so they do not appear in the most recent outputs.
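A minimal sketch of what such staged tuning might look like, assuming scikit-learn's GridSearchCV is used. The synthetic data, the parameter grids, and the names coarse_grid and fine_grid are illustrative assumptions, not the notebook's actual values.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption); the notebook's song dataset is not reproduced here.
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# Iteration 1 (hypothetical grid): coarse search over tree count and depth.
# In the notebook, the outcome of a finished iteration would be kept as a comment.
coarse_grid = {"n_estimators": [20, 50], "max_depth": [3, None]}
coarse = GridSearchCV(
    RandomForestRegressor(random_state=0), coarse_grid, cv=3, scoring="r2"
)
coarse.fit(X, y)

# Iteration 2: hold the best values from iteration 1 fixed and tune one more parameter.
fine_grid = {
    "n_estimators": [coarse.best_params_["n_estimators"]],
    "max_depth": [coarse.best_params_["max_depth"]],
    "min_samples_leaf": [1, 3, 5],
}
fine = GridSearchCV(
    RandomForestRegressor(random_state=0), fine_grid, cv=3, scoring="r2"
)
fine.fit(X, y)

best_params = fine.best_params_
best_score = fine.best_score_
print(best_params, round(best_score, 3))
```

Splitting the search this way keeps each run short, at the cost of not exploring every combination of parameters jointly.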

Random Forest Regressor
Saved runs for Random Forest tuning
Gradient Boosting Regressor
Saved runs for Gradient Boosting tuning

Plotting CV Test Scores During Sample Parameter Tuning

Assessing Pairwise Ranking Accuracy
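
For reference, one common definition of pairwise ranking accuracy is the fraction of song pairs whose predicted popularity order matches their true order. The sketch below assumes that definition and skips pairs tied in true popularity; the function name and the example values are hypothetical and may differ from the notebook's own implementation.

```python
from itertools import combinations


def pairwise_ranking_accuracy(y_true, y_pred):
    """Fraction of pairs (with different true values) ranked in the correct order."""
    correct = 0
    total = 0
    for i, j in combinations(range(len(y_true)), 2):
        true_diff = y_true[i] - y_true[j]
        if true_diff == 0:
            continue  # ties in the true values carry no ordering information
        total += 1
        pred_diff = y_pred[i] - y_pred[j]
        if true_diff * pred_diff > 0:  # same sign means same ordering
            correct += 1
    return correct / total if total else float("nan")


# Made-up popularity scores for four songs (illustrative only).
example_true = [10, 40, 25, 70]
example_pred = [15, 35, 38, 60]
score = pairwise_ranking_accuracy(example_true, example_pred)
print(round(score, 3))
```

Unlike error metrics such as RMSE, this score only rewards getting the relative order right, which is often what matters when the predictions feed a ranked list.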

Assessing Testing Data

Capturing Feature Importance

Clustering

Importing Packages

Start with fresh dataframe(s)

Sample Clustering

2-dimensional PCA Clustering